Treebank Conversion: Creating a German f-structure bank from the TIGER Corpus

Author

  • Martin Forst
Abstract

This paper reports on the conversion of the TIGER treebank, a syntactically annotated corpus of German newspaper texts, into a test suite for a broad-coverage Lexical-Functional Grammar (LFG) for German. It presents the two major steps of the conversion: an XSLT transformation of the TIGER XML representation into a relational, Prolog-like representation, followed by the application to that representation of term-rewriting rules of the kind used in certain MT transfer components. It then discusses some problems that arise from considerable differences in analysis or from information not encoded in the TIGER representation. The output consists of (partly ambiguous) f-structure charts, which can then be matched against the grammar’s output for evaluation purposes.
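The two steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: Python stands in for both the XSLT stylesheet and the transfer-style rewriting component, the XML fragment is a hypothetical, simplified sentence in the style of TIGER XML, and the fact names (`terminal`, `nonterminal`, `edge`, `subj`) and the single label mapping are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment in the style of TIGER XML
# (invented for illustration, not taken from the corpus).
SAMPLE = """
<s id="s1">
  <graph root="s1_500">
    <terminals>
      <t id="s1_1" word="Hans" pos="NE"/>
      <t id="s1_2" word="lacht" pos="VVFIN"/>
    </terminals>
    <nonterminals>
      <nt id="s1_500" cat="S">
        <edge label="SB" idref="s1_1"/>
        <edge label="HD" idref="s1_2"/>
      </nt>
    </nonterminals>
  </graph>
</s>
"""

# Assumed, illustrative mapping from an edge label to an LFG-style function name.
LABEL_TO_FUNCTION = {"SB": "subj"}


def tiger_to_facts(xml_text):
    """Step 1: flatten the XML graph into relational facts (as tuples)."""
    root = ET.fromstring(xml_text)
    facts = []
    for t in root.iter("t"):
        facts.append(("terminal", t.get("id"), t.get("word"), t.get("pos")))
    for nt in root.iter("nt"):
        facts.append(("nonterminal", nt.get("id"), nt.get("cat")))
        for edge in nt.findall("edge"):
            facts.append(("edge", nt.get("id"), edge.get("idref"), edge.get("label")))
    return facts


def rewrite(facts):
    """Step 2: one toy rewrite rule, edge(P, C, Label) ==> function(P, C)."""
    out = []
    for fact in facts:
        if fact[0] == "edge" and fact[3] in LABEL_TO_FUNCTION:
            out.append((LABEL_TO_FUNCTION[fact[3]], fact[1], fact[2]))
        else:
            out.append(fact)  # facts without a matching rule pass through unchanged
    return out


def render(fact):
    """Format a tuple as a Prolog-style fact with quoted atoms."""
    args = ", ".join(
        "'" + a.replace("\\", "\\\\").replace("'", "\\'") + "'" for a in fact[1:]
    )
    return f"{fact[0]}({args})."


if __name__ == "__main__":
    for fact in rewrite(tiger_to_facts(SAMPLE)):
        print(render(fact))
```

Keeping the flattening step separate from the rewriting step mirrors the division of labour in the abstract: the first step only changes the data format, while all linguistic decisions live in the rewrite rules.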


Similar resources

Towards A Dependency-Based Gold Standard For German Parsers: The TIGER Dependency Bank

In this paper we discuss the construction, features and intended uses of the TiGer DB. The TiGer DB is a dependency bank derived from the TiGer Treebank containing predicate-argument relations and several grammatical features which can be considered as semantically meaningful. It is produced semi-automatically by the conversion of the TiGer Treebank into an LFG f-structure bank, which then in t...


On Representing Dependency Relations – Insights from Converting the German TiGerDB

Research in parser evaluation has led to the creation of dependency resources such as the TiGer Dependency Bank, a semi-automatic conversion of a subset of the TIGER Treebank. We explore the relationship between the TiGerDB representation and a more surface-oriented dependency analysis of German and describe how we mapped and recoded the TiGerDB into a format more closely linked to the original...


The TIGER 700 RMRS Bank: RMRS Construction from Dependencies

We present a treebank conversion method by which we construct an RMRS bank for HPSG parser evaluation from the TIGER Dependency Bank. Our method effectively performs automatic RMRS semantics construction from functional dependencies, following the semantic algebra of Copestake et al. (2001). We present the semantics construction mechanism, and focus on some special phenomena. Automatic convers...


Making Ellipses Explicit in Dependency Conversion for a German Treebank

We present a carefully designed dependency conversion of the German phrase-structure treebank TiGer that explicitly represents verb ellipses by introducing empty nodes into the tree. Although the conversion process uses heuristics, like many other conversion tools, we designed them to fail if no reasonable solution can be found. The failure of the conversion process makes it possible to detect el...


Projecting RMRS from TIGER Dependencies

We present a method for automatic RMRS semantics construction from dependency structures, following the semantic algebra of Copestake et al. (2001). We have applied this method to a subset of the TIGER Dependency Bank for German (Forst et al., 2004) to obtain a semantic treebank for (HPSG) parser evaluation. We describe the semantics construction mechanism and give evaluation figures from manua...




Publication date: 2003